A simple non-parametric Topic Mixture for Authors and Documents

Author

  • Arnim Bleier
Abstract

This article reviews the Author-Topic Model and presents a new non-parametric extension based on the Hierarchical Dirichlet Process. The extension is especially suitable when no prior information about the required number of components is available. A blocked Gibbs sampler is described, with the focus on staying as close as possible to the original model and adding only the minimum necessary theoretical and implementation overhead.

Similar resources

Nested Hierarchical Dirichlet Processes for Multi-Level Non-Parametric Admixture Modeling

The Dirichlet Process (DP) is a Bayesian non-parametric prior for infinite mixture modeling, where the number of mixture components grows with the number of data items. The Hierarchical Dirichlet Process (HDP), often used for non-parametric topic modeling, is an extension of the DP for grouped data, where each group is a mixture over shared mixture densities. The Nested Dirichlet Process (nDP), on the o...

Bayesian Document Generative Model with Explicit Multiple Topics

In this paper, we propose a novel probabilistic generative model to deal with explicit multiple-topic documents: the Parametric Dirichlet Mixture Model (PDMM). PDMM extends an existing probabilistic generative model, the Parametric Mixture Model (PMM), using a hierarchical Bayes model. PMM models multiple-topic documents by mixing model parameters of each single topic with an equal mixture ratio. P...

Drawing Co-Citation Networks of Corona Virus Studies

Background and Aim: The purpose of the present study is to map the coronavirus domain citation network to better understand this domain based on all other citation networks. Materials and Methods: The present study is applied in terms of purpose, and is descriptive scientometrics in terms of type, which has been done with the all-citation method. In this study, all scientific publications on ...

The Author-Topic Model for Authors and Documents

We introduce the author-topic model, a generative model for documents that extends Latent Dirichlet Allocation (LDA; Blei, Ng, & Jordan, 2003) to include authorship information. Each author is associated with a multinomial distribution over topics and each topic is associated with a multinomial distribution over words. A document with multiple authors is modeled as a distribution over topics th...

Dynamic Non-Parametric Mixture Models and The Recurrent Chinese Restaurant Process

Dirichlet process mixture models provide a flexible Bayesian framework for density estimation; however they are inadequate with respect to modeling sequential data due to the full exchangeability assumption they employ. In this paper we present the temporal Dirichlet process mixture model (TDPM) as a framework for modeling complex longitudinal data. In a TDPM, the data is divided into epochs; a...

Journal title:
  • CoRR

Volume abs/1211.6248

Publication date: 2012